Skip to content

Conversation

maropu
Copy link
Member

@maropu maropu commented Jun 22, 2016

What changes were proposed in this pull request?

Update a doc because it's stale.
This is a trivial fix, so I didn't create a JIRA ticket.
This fix was suggested in #13802 by @hvanhovell.

How was this patch tested?

N/A

* Indicates if this function needs to aggregate values group-by-group in a single step.
* If false, we must always use a `SortAggregateExec` operator without partial aggregates.
*/
def supportsPartial: Boolean = true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd be also better to rename this variable instead of supportsPartial because it's kinds of misleading; forceSortAggregate, needsSequentialAggregate, ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the names, they describe however the inverse of what this flag does. We could invert the flag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. Since the suggested fix changes the code, I'll create a new jira ticket and attach this pr with it.

@maropu
Copy link
Member Author

maropu commented Jun 22, 2016

@hvanhovell ping

@SparkQA
Copy link

SparkQA commented Jun 22, 2016

Test build #61056 has finished for PR 13852 at commit a4d26b3.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 24, 2016

@hvanhovell ping

/**
* Indicates if this function supports partial aggregation.
* Currently Hive UDAF is the only one that doesn't support partial aggregation.
* Indicates if this function needs to aggregate values group-by-group in a single step.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also the inverse.

@maropu maropu changed the title [SQL][DOC] Update a description for AggregateFunction#supportsPartial [SPARK-16200][SQL] Rename AggregateFunction#supportsPartial Jun 25, 2016
@SparkQA
Copy link

SparkQA commented Jun 25, 2016

Test build #61215 has finished for PR 13852 at commit 00aac26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 30, 2016

@hvanhovell ping

@rxin
Copy link
Contributor

rxin commented Jul 11, 2016

are we overloading the semantics? I think it's actually useful to have a supportsPartial, which is what this was for.

@maropu
Copy link
Member Author

maropu commented Jul 11, 2016

you mean we need two funcs: supportPartial and forceSortAggregate?

@rxin
Copy link
Contributor

rxin commented Jul 11, 2016

Do we have two requirements here?

One is whether an aggregate function supports partial aggregation, and the other is whether the order should be enforced right ?

@maropu
Copy link
Member Author

maropu commented Jul 11, 2016

Yes. Currently, we have three functions with supportPartial=false: hive_udaf, collect, and window functions. The former two functions cannot support this because we do not have byte-backed mutable and growable ArrayData and MapData. I think, if we implement these data structures, we can set true at supportPartial. However, window functions need ordered evaluation. So, since the name is kinda misleading, I think.

@hvanhovell
Copy link
Contributor

@maropu we are close to merging PR #14753. This will eliminate the need for a per-group (sorted) aggregation approach. On the longer run, it might also eliminate the need for a supportsPartial flag.

Shall we close this one?

@maropu
Copy link
Member Author

maropu commented Aug 25, 2016

okay, thanks!

@maropu maropu closed this Aug 25, 2016
override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType)

override def supportsPartial: Boolean = false
override def forceSortAggregate: Boolean = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, after changing this name, it will not show that we do not partial agg for this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea. either way, it seems partial agg. becomes meaningless in future.

@maropu maropu deleted the FixDocInAggIf branch July 5, 2017 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants